Contextual Analysis for Text Recognition: A Comparison with Human Performance
Abstract
A common approach to the recognition of text (handwritten or machine-printed) is to generate a set of candidate words for each unknown word in the input. The resulting sequence of word candidate sets may then be further disambiguated by applying contextual knowledge in the form of a language model. This paper describes language models derived from large text corpora and demonstrates the performance improvements obtained. In addition, it describes an experiment in which word candidate sets were presented to human subjects for manual disambiguation. The error rates so produced provide an independent, quantitative measure of the difficulty of this task. Moreover, a second trial in which the subjects were provided with domain information produced significantly reduced error rates. This result suggests that the effective use of topic area information can make a valuable contribution to the text recognition process.
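As a rough illustration of the idea in the abstract, the sketch below picks one word from each candidate set so that the whole sequence scores best under a language model. It is not the paper's method: the bigram model, the add-one smoothing, the toy corpus, and the names `train_bigrams`, `log_prob`, and `disambiguate` are all assumptions made for this example.

```python
import math
from collections import Counter


def train_bigrams(sentences):
    """Count unigrams and bigrams in a small corpus (hypothetical helper)."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split()
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def log_prob(prev, word, unigrams, bigrams, vocab_size):
    """Add-one smoothed bigram log-probability (an assumed smoothing choice)."""
    return math.log((bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab_size))


def disambiguate(candidate_sets, unigrams, bigrams):
    """Viterbi search: choose one word per candidate set, maximising the
    total bigram log-probability of the resulting sequence."""
    vocab_size = len(unigrams)
    # best[w] = (score, path) of the best sequence so far that ends in w
    best = {"<s>": (0.0, [])}
    for candidates in candidate_sets:
        new_best = {}
        for word in candidates:
            score, path = max(
                (
                    (s + log_prob(prev, word, unigrams, bigrams, vocab_size), p)
                    for prev, (s, p) in best.items()
                ),
                key=lambda item: item[0],
            )
            new_best[word] = (score, path + [word])
        best = new_best
    return max(best.values(), key=lambda item: item[0])[1]


# Toy corpus and candidate sets, invented for illustration only.
corpus = ["the cat sat on the mat", "the dog sat on the rug"]
unigrams, bigrams = train_bigrams(corpus)
candidate_sets = [["the", "tho"], ["cat", "cot"], ["sat", "set"]]
result = disambiguate(candidate_sets, unigrams, bigrams)
print(result)
```

A real system would train the model on a large text corpus and would usually combine its score with the recognizer's own confidence for each candidate. This sketch leaves both out.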
Similar resources
Emotion Detection in Persian Text; A Machine Learning Model
This study aimed to develop a computational model for recognizing emotion in Persian text, framed as a supervised machine learning problem. We used Plutchik's emotion model as the supervised learning criterion and a Support Vector Machine (SVM) as the baseline classifier. We also used the NRC lexicon and contextual features as training data and components of the model. One hundred selected texts including pol...
The Impact of Contextual Clue Selection on Inference
Linguistic information can be conveyed in the form of speech and written text, but it is the content of the message that is ultimately essential for higher-level processes in language comprehension, such as making inferences and associations between text information and knowledge about the world. Linguistically, inference is the shovel that allows receivers to dig meaning out from the text with...
Context modeling for text/non-text separation in free-form online handwritten documents
Free-form online handwritten documents contain a high diversity of content, organized without constraints imposed on the user. The lack of prior knowledge about content and layout makes the modeling of contextual information crucially important for the interpretation of such documents. In this work, we present a comprehensive investigation of the sources of contextual information that can benefit...
Guided Text Spotting for Assistive Blind Navigation in Unfamiliar Indoor Environments
Scene text in indoor environments usually preserves and communicates important contextual information which can significantly enhance the independent travel of blind and visually impaired people. In this paper, we present an assistive text spotting navigation system based on an RGB-D mobile device for blind or severely visually impaired people. Specifically, a novel spatial-temporal text locali...
A Novel Approach to Conditional Random Field-based Named Entity Recognition using Persian Specific Features
Named Entity Recognition is an information extraction technique that identifies named entities in a text. Three methods have conventionally been used to extract named entities from a text: rule-based, machine-learning-based, and a hybrid of the two. Machine-learning-based methods perform well in the Persian language if they are trained with good features. To get good performanc...